Clustering Images with Multinomial Mixture Models
Abstract
In this paper, we propose a method for image clustering using multinomial mixture models. The mixture of multinomial distributions, often called a multinomial mixture, is a probabilistic model mainly used for text mining. The effectiveness of the multinomial distribution for text mining originates from the fact that words can be regarded, to a first approximation, as independently generated. In this paper, we apply the multinomial distribution to image clustering. We regard each color as a “word” and each color histogram as a “term frequency” distribution. Recent work provides evidence that color histograms are quite non-Gaussian. Therefore, we assume that there are no intrinsic dependencies between the frequencies of similar colors and adopt the multinomial model. Because our model assumes no intrinsic dependencies among image features, we can append other information, e.g. spatial information. Our experiment compares multinomial mixture, Dirichlet mixture, and k-means. Dirichlet mixture, a Bayesian version of multinomial mixture, is applied to image clustering for the first time as far as we know. The results of our evaluation experiment demonstrate that multinomial mixture gives higher accuracy than k-means. Further, Dirichlet mixture provides a result comparable to the best result of multinomial mixture.
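The idea of treating each color bin as a “word” and each histogram as a term-frequency vector can be illustrated with a minimal sketch. The abstract does not state how the mixture is fitted, so the EM procedure below is an assumption (it is a common way to fit multinomial mixtures), and the function names, the smoothing constant, and the three-bin toy histograms are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_multinomial_mixture(histograms, n_clusters, n_iters=50,
                            smoothing=1e-2, seed=0):
    """Fit a multinomial mixture to count vectors with EM (illustrative).

    histograms: list of equal-length lists of non-negative counts,
                one color histogram per image.
    Returns (weights, thetas, resp), where resp[i][k] is the posterior
    probability that image i belongs to cluster k.
    """
    rng = random.Random(seed)
    n_images = len(histograms)
    n_bins = len(histograms[0])

    # Start from random soft assignments (each row sums to 1).
    resp = []
    for _ in range(n_images):
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        resp.append([r / total for r in row])

    weights, thetas = [], []
    for _ in range(n_iters):
        # M-step: mixing weights and per-cluster color probabilities.
        # The small smoothing term keeps every parameter strictly positive.
        weights = [
            (sum(r[k] for r in resp) + smoothing)
            / (n_images + n_clusters * smoothing)
            for k in range(n_clusters)
        ]
        thetas = []
        for k in range(n_clusters):
            counts = [smoothing] * n_bins
            for r, hist in zip(resp, histograms):
                for j, c in enumerate(hist):
                    counts[j] += r[k] * c
            total = sum(counts)
            thetas.append([c / total for c in counts])

        # E-step: posterior over clusters, computed in log space.
        # The multinomial coefficient is the same for every cluster,
        # so it cancels and is omitted.
        resp = []
        for hist in histograms:
            log_post = [
                math.log(weights[k])
                + sum(c * math.log(thetas[k][j]) for j, c in enumerate(hist))
                for k in range(n_clusters)
            ]
            top = max(log_post)
            unnorm = [math.exp(lp - top) for lp in log_post]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
    return weights, thetas, resp


def hard_labels(resp):
    """Assign each image to its most probable cluster."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in resp]


# Hypothetical toy data: three-bin histograms (red, green, blue counts).
toy_histograms = [
    [30, 2, 1], [28, 3, 2], [31, 1, 0],   # mostly red images
    [1, 2, 30], [0, 3, 29], [2, 1, 33],   # mostly blue images
]
weights, thetas, resp = fit_multinomial_mixture(toy_histograms, n_clusters=2)
labels = hard_labels(resp)
```

In this sketch the cluster indices themselves are arbitrary; only the grouping matters. Extra features such as spatial information could in principle be appended as additional bins, in line with the independence assumption described above, though how the paper encodes them is not specified here.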
Similar resources
Opinion mining from twitter data using evolutionary multinomial mixture models
Image of an entity can be defined as a structured and dynamic representation which can be extracted from the opinions of a group of users or population. Automatic extraction of such an image has certain importance in political science and sociology related studies, e.g., when an extended inquiry from large-scale data is required. We study the images of two politically significant entities of Fr...
Temporal Multinomial Mixture for Instance-Oriented Evolutionary Clustering
Evolutionary clustering aims at capturing the temporal evolution of clusters. This issue is particularly important in the context of social media data that are naturally temporally driven. In this paper, we propose a new probabilistic model-based evolutionary clustering technique. The Temporal Multinomial Mixture (TMM) is an extension of classical mixture model that optimizes feature co-occurre...
Incremental Mixture Learning for Clustering Discrete Data
This paper elaborates on an efficient approach for clustering discrete data by incrementally building multinomial mixture models through likelihood maximization using the Expectation-Maximization (EM) algorithm. The method adds sequentially at each step a new multinomial component to a mixture model based on a combined scheme of global and local search in order to deal with the initialization p...
A Probabilistic Hierarchical Clustering Method for Organising Collections of Text Documents
In this paper a generic probabilistic framework for the unsupervised hierarchical clustering of large-scale sparse high-dimensional data collections is proposed. The framework is based on a hierarchical probabilistic mixture methodology. Two classes of models emerge from the analysis and these have been termed as symmetric and asymmetric models. For text data specifically both asymmetric and sy...
Probabilistic Hierarchical Clustering Method for Organizing Collections of Text Documents
In this paper a generic probabilistic framework for the unsupervised hierarchical clustering of large-scale sparse high-dimensional data collections is proposed. The framework is based on a hierarchical probabilistic mixture methodology. Two classes of models emerge from the analysis and these have been termed as symmetric and asymmetric models. For text data specifically both asymmetric and sy...